Type Reconstruction
with  First-Class  Polymorphic  Values 

James  William  O’Toole  Jr.* 

David  K.  Gifford^ 


Abstract 

We present the first type reconstruction system which
combines the implicit typing of ML with the full
power of the explicitly typed second-order polymorphic
lambda calculus. The system will accept ML-style programs,
explicitly typed programs, and programs that
use explicit types for all first-class polymorphic values.
We accomplish this flexibility by providing both generic
and explicitly-quantified polymorphic types, as well as
operators which convert between these two forms of
polymorphism. This type reconstruction system is an
integral part of the FX-89 programming language. We
present a type reconstruction algorithm for the system.
The type reconstruction algorithm is proven sound and
complete with respect to the formal typing rules.

Categories and Subject Descriptions: D.1.m [Programming
Techniques] - Miscellaneous: First-Class Polymorphism;
D.3.1 [Programming Languages] - Formal Definitions and
Theory; D.3.3 [Programming Languages] - Language
Constructs: Implicit Typing; D.3.4 [Programming
Languages] - Processors: Compilers.

General Terms: Languages, Type Theory, Polymorphism.

Additional  Key  Words  and  Phrases:  type  systems,  effect 
systems,  type  inference,  type  reconstruction,  FX-89. 


1  Combining  Generic  and 
First-Class  Polymorphism 

Type reconstruction relieves the programmer of the burden
of providing type information while retaining the
benefits of strongly-typed languages, including superior
performance, documentation, and safety. However,
present systems for type reconstruction, such as the ML
type system [Milner78], do not permit the use of first-class
polymorphic values. Explicitly-typed languages,
such as FX-87 [Gifford87], do permit first-class polymorphic
values, but they do not provide the programmer
with the convenience of implicitly-typed languages
such as ML.

The FX-89 programming language is a revision and
extension of FX-87. FX-89 is based on a type reconstruction
system that combines the flexibility of ML
with the full typing ability of the explicitly typed
second-order lambda calculus. This reconstruction system will
accept ML-style programs, explicitly-typed programs,
and programs that use explicit types for all first-class
polymorphic values.

In  this  paper  we  describe  both  the  theoretical  basis 
of  the  FX-89  type  reconstruction  system  and  our  type 
reconstruction  algorithm.  The  algorithm  described  in 
the  paper  has  been  implemented. 

In order to simplify our presentation we will restrict
our attention to a simplified version of FX-89 which
we will call IFX. The full FX-89 language includes
side-effects, modules, oneofs, references, and other data
types. These constructs can be added to the type inference
system described below.

In the remainder of this paper we discuss the previous
work in this area (Section 2), introduce IFX (Section 3),
present a system of type reconstruction rules (Section
4), describe an algorithm which reconstructs IFX types
(Section 5), prove the correctness of the algorithm (Section
6), briefly discuss possible extensions (Section 7),
and conclude with some observations on our result (Section 8).


2 Previous Work

[Milner78] presents a typing system based on type
schemes in which the let construct provides generic
polymorphism. ML, as presented in [Damas82], uses
generic type variables to express polymorphism. The
ML type discipline is not as powerful as the type discipline
of the second-order polymorphic lambda calculus
[Fortune83]. Type quantifiers are not explicit in ML,
and it is therefore not possible to express the type of a
function which expects a polymorphic value as an argument.
For this reason, we say that ML does not provide
first-class polymorphism.

[McCracken84] introduced a type reconstruction system
for the second-order polymorphic lambda calculus.
McCracken's system did not provide the generic let
construct of ML, although it did attempt to support the
automatic instantiation of explicitly typed polymorphic
functions in application position. Both [McCracken84]
and [Leivant83] attempted to provide automatic type
abstraction in more general type systems, but these results
are flawed (see sections 4.5 and 4.6).

The general partial polymorphic type inference problem
was shown to be undecidable by [Boehm85]. More
recent work [Kfoury88] has shown that conservative extensions
to ML providing restricted polymorphism are
possible, but has not provided a practical type reconstruction
algorithm. Recently, [Pfenning88] related the
complexity of the partial type reconstruction problem
for the second-order polymorphic lambda calculus to
that of second-order unification, which is well-known to
be undecidable [Goldfarb81].

We believe our type reconstruction method is the first
to combine the implicit typing of ML with the full power
of the second-order polymorphic lambda calculus. We
accomplish this flexibility by providing both generic and
explicitly-quantified polymorphic types, as well as operators
which convert between these two forms of polymorphism.
The difficulty of second-order unification is
avoided via syntactic restrictions that define the types
which may be omitted by the programmer.

Below are some examples which illustrate the relative
power of the ML typing system, McCracken's typing
system (MTS), and our system (IFX). The let binding
construct of ML permits generic type abstraction and
instantiation of twice and id:


    (let ((twice (lambda (f x) (f (f x))))
          (id (lambda (x) x)))
      (cons (twice id 0) (twice id true)))


This example is not well-typed in MTS because no
generic polymorphism is provided. This example is
well-typed in ML because the variables twice and id are
assigned generic types, and these generic types are automatically
instantiated as necessary. In general, ML
programs cannot be typed by MTS without the addition
of extensive explicit type abstraction and instantiation
information. The explicit typing of MTS, and
the implicit instantiation of the functions cons and f,
permit:

    (lambda (f : ∀t.t → t)
      (cons (f 0) (f true)))

The second example is not well-typed in ML, because
the lambda-bound variable f must have two incompatible
types within the body of the lambda. In ML, a
lambda-bound variable cannot be given a generic type
within the body of the lambda because the type language
is not capable of expressing the resulting function
type of the lambda, which must contain an explicit type
quantifier. First-class polymorphism allows the variable
f to be assigned an explicitly quantified polymorphic
type.

Our system permits both of the above examples. McCracken
introduced the close operator to allow the programmer
to indicate where type abstraction should occur
without having to specify precisely what those type
abstractions should be. A discussion of the formal typing
rule for the close operator and why our modification
to the rule in [McCracken84] is necessary appears
in sections 4.5 and 4.6.

Unlike McCracken's system, our type system contains
both ML-style generic types and explicitly quantified
types, and we therefore require that the programmer
indicate where explicit quantifiers should be removed
from a type. The following example illustrates the use
of both explicit and generic polymorphism:

    (lambda (g : ∀t.t → t)
      (let ((twice (lambda (f x)
                     (f (f x))))
            (g (open g)))
        (cons (twice g 0)
              (twice g true))))

The open operator is used to convert an explicitly
quantified polymorphic type into an ML-style generic
polymorphic type, and close is used to make the opposite
conversion.

In sum, the major advantages of our type reconstruction
system are:

•  Programmers  may  write  without  using  explicit  type 
specifications.  ML-style  programs  may  be  used 
without  modification  in  our  type  system. 

•  Programmers  may  write  using  fully  explicit  type 
specifications.  Explicitly-typed  programs  may  be 
used  without  modification  in  our  type  system. 

• Programmers may use explicit types where desirable
for documentation or other purposes, and omit
them where they would decrease readability.

• First-class polymorphic values can be used, provided
that their types are declared. Thus, modules
can be first-class values in our system. Programmers
may use the open and close operators to simplify
the use of first-class polymorphic values.

3  IFX:  A  Typed  Language 

For pedagogical purposes we will study type reconstruction
for IFX, a simplified version of FX-89. FX-89 is a
polymorphic typed language that allows side-effects and
first-class functions. Its syntax and most of its standard
operations are strongly inspired by Scheme [Rees86].
The language IFX has the following Type and Expression
domains (where I is the domain of identifiers, and
P are the primitive types):

    t : I                            Identifiers
    X : P                            Primitive types

    υ : U ::= P                      primitive type
            | I                      type identifier
            | U → U                  function
            | ∀I1...In.U             polymorphic type

    e : E ::= I                      variable
            | (lambda (I) E)         lambda
            | (lambda (I : U) E)     abstraction
            | (E E)                  application
            | (let (I E) E)          generic-let
            | (plambda (I...) E)     type abstraction
            | (close E)              type closure
            | (proj E U...)          projection
            | (open E)               automatic projection

The type domain U contains the types which are supplied
by the programmer in explicit type declarations.
The type of a function encodes the type of its argument
and its result. If the type of the argument is monomorphic,
then it may be omitted. The type $\forall t.\upsilon$ represents
the type of polymorphic values abstracted over the type
parameter $t$.

In the expression domain, just as lambda abstracts
E over the ordinary variable I of type U, plambda abstracts
E over the type identifier I to yield a polymorphic
value. A polymorphic value is instantiated with
the proj construct. The close and open constructs
provide automatic type abstraction and instantiation.
As an example, we give below the code of the polymorphic
compose function that composes the function
f with the function g:

    (close
      (lambda (f)
        (lambda (g)
          (lambda (x)
            (g (f x))))))

Note that compose is automatically abstracted over
the argument and result types of the functions f and g.
The type of compose is:

    compose : ∀t s r.(t → s) → (s → r) → t → r

The types of f, g, and x were omitted from the program
text, but could equally well have been included by the
programmer.

4  Type  Reconstruction  System 

We present the typing system of IFX as a formal deduction
system consisting of a set of type reconstruction
rules. The type system contains generic (i.e. general)
type variables, and distinguishes between these generic
type variables and the type identifiers which appear in
user-supplied types. The type system also distinguishes
between monomorphic and polymorphic types:

    α : G                            General type variables

    μ : M ::= P                      primitive type
            | I                      type identifier
            | G                      general type variable
            | M → M                  function

    τ : T ::= P                      primitive type
            | I                      type identifier
            | G                      general type variable
            | T → T                  function
            | ∀I1...In.T             polymorphic type

The IFX typing rules make use of an important distinction
between the M and T type domains. The rules
are designed so that M types may be omitted from
formal argument type declarations, but T types may
not. Thus, the different levels in our type syntax specify
the restrictions on the input programs. The use of
syntactically-specified restrictions is intended to communicate
clearly to the programmer the limitations of
the type reconstruction system.


4.1 Type Schemes

The IFX type system supports the generic polymorphism
found in ML, as well as the explicit polymorphism
found in Reynolds' second-order polymorphic lambda
calculus. In order to provide generic polymorphism, we
define type schemes, which represent the generic (i.e.
general) type of a variable which is permitted multiple
instantiations:

Definition (Type Scheme). A type scheme $\eta$ is a
term of the form

$$\boldsymbol{\forall}\alpha_1 \ldots \alpha_n.\tau,$$

where $\alpha_1 \ldots \alpha_n$ are the generic variables of $\tau \in T$.

We distinguish $\boldsymbol{\forall}$ and $\forall$ deliberately; $\boldsymbol{\forall}$ binds the
generic type variables of a type scheme, and $\forall$ binds
type variables within a type.

Definition (Alpha-renaming). Types $\tau$ and $\tau'$ are
alpha-renamable (written $\tau \sim \tau'$) iff some renaming of
type variables bound in $\tau$ produces $\tau'$.

Definition (Instantiation). The type $\tau'$ is an instance
of the scheme $\eta = \boldsymbol{\forall}\alpha_1 \ldots \alpha_n.\tau$ (written $\eta \succeq \tau'$) iff
there are monomorphic $\mu_1 \ldots \mu_n$ such that $\tau[\mu_i/\alpha_i] \sim \tau'$.
(We extend $\succeq$ to type schemes by $\eta \succeq \eta'$ iff
$\forall \tau : \eta' \succeq \tau \Rightarrow \eta \succeq \tau$.)

Note that only M types may be substituted to produce
instantiations, and that we assume that substitution
takes place with renaming of any bound type variables
to avoid capture. The result of substituting $\mu$ for $t$ in
$\tau$ will be written $\tau[\mu/t]$. The type scheme $\eta = \boldsymbol{\forall}.\tau$,
having no generic type variables, will occasionally be
abbreviated as $\tau$.

We first present the inference rules for explicitly typed
terms. A type assignment $A$ maps each variable in its
domain to a type scheme. We will use $A_x$ to refer to
the type assignment $A$ with the assignment for variable
$x$ removed.

We use FGV($\tau$) to refer to the free general type variables
of $\tau$, and FTV($\tau$) to refer to the free type identifiers
of $\tau$. Similarly, FGV($A$) refers to the free general
type variables of the type schemes in the assignment $A$.
We define Gen($A$, $\tau$) as follows:

Definition (Generalization). The generalization
of $\tau$ with respect to $A$ (written Gen($A$, $\tau$)) is the type
scheme $\eta = \boldsymbol{\forall}\alpha_i.\tau$, where $\{\alpha_i\} = \mathrm{FGV}(\tau) - \mathrm{FGV}(A)$.

4.2 The Deduction System

The type reconstruction rules of IFX are as follows:

LAMBDA

$$\frac{A_x + (x : \upsilon) \vdash e : \tau}{A \vdash (\texttt{lambda}\ (x : \upsilon)\ e) : \upsilon \to \tau}$$

APPL

$$\frac{A \vdash e : \tau_a \to \tau_r \qquad A \vdash e_a : \tau_a}{A \vdash (e\ e_a) : \tau_r}$$



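PLAMBDA

$$\frac{A \vdash e : \tau \qquad A \vdash e : \tau' \Rightarrow \mathrm{Gen}(A,\tau) \succeq \mathrm{Gen}(A,\tau') \qquad t_i \notin \mathrm{FTV}(A)}{A \vdash (\texttt{plambda}\ (t_i)\ e) : \forall t_i.\tau}$$

PROJ

$$\frac{A \vdash e : \forall t_i.\tau}{A \vdash (\texttt{proj}\ e\ \upsilon_i) : \tau[\upsilon_i/t_i]}$$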

The  above  rules  describe  the  typing  requirements  of 
value  abstraction,  value  application,  type  abstraction, 
and  type  application. 
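
As an illustration of the PROJ rule, the substitution of the supplied types for the bound type identifiers can be sketched in Python. The tuple encoding of types, the helper names (free_ids, subst, project), and the name_k scheme used for renamed binders are assumptions made for this sketch; none of them comes from IFX or FX-89. The renaming step reflects the requirement that substitution avoid capture of bound type variables, and the sketch assumes that the generated names do not clash with identifiers already in use.

```python
import itertools

_fresh = itertools.count()  # suffixes for renamed binders (assumed naming scheme)

INT = ("prim", "int")

def free_ids(ty):
    """Set of type identifiers occurring free in ty."""
    tag = ty[0]
    if tag == "id":
        return {ty[1]}
    if tag == "fun":
        return free_ids(ty[1]) | free_ids(ty[2])
    if tag == "all":
        return free_ids(ty[2]) - set(ty[1])
    return set()

def subst(ty, mapping):
    """Replace free type identifiers of ty according to mapping, avoiding capture."""
    tag = ty[0]
    if tag == "id":
        return mapping.get(ty[1], ty)
    if tag == "fun":
        return ("fun", subst(ty[1], mapping), subst(ty[2], mapping))
    if tag == "all":
        # Identifiers rebound by this quantifier are not substituted.
        inner = {k: v for k, v in mapping.items() if k not in ty[1]}
        incoming = set().union(*[free_ids(v) for v in inner.values()])
        names, renaming = [], {}
        for name in ty[1]:
            if name in incoming:  # this binder would capture an incoming identifier
                new = "%s_%d" % (name, next(_fresh))
                renaming[name] = ("id", new)
                names.append(new)
            else:
                names.append(name)
        body = subst(ty[2], renaming) if renaming else ty[2]
        return ("all", tuple(names), subst(body, inner))
    return ty  # primitive types are unchanged

def project(poly, args):
    """Type of (proj e args...) when e has the quantified type poly."""
    if poly[0] != "all" or len(poly[1]) != len(args):
        raise TypeError("proj needs a polymorphic type with one binder per argument")
    return subst(poly[2], dict(zip(poly[1], args)))

# The polymorphic identity type, applied to the primitive type int.
ident = ("all", ("t",), ("fun", ("id", "t"), ("id", "t")))
int_to_int = project(ident, [INT])
```

Under this encoding, project rejects an application of the identity type to two type arguments, since that type binds only one identifier.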

The following rules describe the typing requirements
of our open/close operators, ML-style generic let, and
the automatic type application of polymorphic functions.
These typing rules permit convenient use of polymorphic
values. Note that only M types are synthesized
by the reconstruction rules, as the omitted types of
lambda-bound variables, or as type parameters in open
or generic variable instantiation. The T types, containing
explicit quantifiers, must be provided explicitly.

OPEN

$$\frac{A \vdash e : \forall t_i.\tau}{A \vdash (\texttt{open}\ e) : \tau[\mu_i/t_i]}$$

CLOSE

$$\frac{A \vdash e : \tau \qquad A \vdash e : \tau' \Rightarrow \mathrm{Gen}(A,\tau) \succeq \mathrm{Gen}(A,\tau') \qquad \{\alpha_i\} = \mathrm{FGV}(\tau) - \mathrm{FGV}(A)}{A \vdash (\texttt{close}\ e) : \forall \alpha_i.\tau}$$

VARINST

$$\frac{(x : \boldsymbol{\forall}\alpha_i.\tau) \in A}{A \vdash x : \tau[\mu_i/\alpha_i]}$$

ILET

$$\frac{A \vdash e_b : \tau_b \qquad A \vdash e_b : \tau_b' \Rightarrow \mathrm{Gen}(A,\tau_b) \succeq \mathrm{Gen}(A,\tau_b') \qquad A_x + (x : \mathrm{Gen}(A,\tau_b)) \vdash e : \tau}{A \vdash (\texttt{let}\ (x\ e_b)\ e) : \tau}$$

ILAMBDA

$$\frac{A_x + (x : \mu) \vdash e : \tau}{A \vdash (\texttt{lambda}\ (x)\ e) : \mu \to \tau}$$

IAPPL

$$\frac{A \vdash e : \forall t_i.(\tau_f \to \tau_r) \qquad A \vdash e_a : \tau_a \qquad \tau_a = \tau_f[\mu_i/t_i]}{A \vdash (e\ e_a) : \tau_r[\mu_i/t_i]}$$

4.3 Generic let

The ILET and VARINST rules provide the ML-style
generic let. ILET associates a generic type scheme
with the let-bound variable, and VARINST permits
each occurrence of the variable to be independently assigned
any instance of its generic type scheme. The
convenience of automatic generalization and instantiation
is provided by these two rules. In IFX, the typing
rules permit this convenience with the caveat that the
automatically deduced type parameters be M types.

The typing power of the ILET rule is equivalent to
that provided by rewriting the let expression in the
usual way, while making use of open and close:

$$((\texttt{lambda}\ (x : \tau)\ e[(\texttt{open}\ x)/x])\ (\texttt{close}\ e_b)).$$

However, this transformation is not pure syntactic
sugar, because it requires $\tau$, the explicitly polymorphic
type of the bound variable.

4.4 Implicit Projection

The IAPPL rule illustrates how implicit type application
of polymorphic functions may be provided. The
implicit instantiation is achieved by making use of the
open operator to obtain a generic type. The application
$(e\ e_a)$ is typed by IAPPL as if written $((\texttt{open}\ e)\ e_a)$.

4.5 Discussion of the CLOSE Rule

The open and close operators provide the programmer
with the means to convert types between the explicitly
quantified style and the generic polymorphism
style. The open operator converts an explicitly quantified
polymorphic type into one of its instantiations.
Using open can be understood as requesting a type application
to automatically determined M type parameters.
Therefore, open is more convenient to use than
proj, but proj must be used when application to T
types is desired.

The close operator takes an expression whose type
contains unbound general type variables, and performs
type abstraction with respect to those type variables.
The IFX operator close has the same intended semantics
as McCracken's type closure operator, with one important
difference: the IFX close operator acts only on
the most general type of the expression. This restriction
is enforced by the CLOSE rule, which contains as an antecedent
that the type used for the closure be the most
general type of the expression. This restriction was not
included in [McCracken84], and its omission precludes the
completeness of the typechecking algorithm W presented in that
work.

4.6  Only  the  Most  General  Type 

The difficulty with the CLOSE rule (and similarly
PLAMBDA and ILET) is that in the absence of the
most general type restriction, a more specific type may
be chosen, leading to a difference in the form of the
explicitly quantified polymorphic type. Consider the
following simple example:

    (proj (close (lambda (z) z)) s r).

The algorithm given in [McCracken84], and our algorithm,
will fail on the above example. The reason is
that the natural typing of (close (lambda (z) z))
is $\forall t.t \to t$, which cannot be applied to two types ($s$
and $r$), but only to a single type. Without the most
general type restriction in the CLOSE rule, the rule
would indicate that we may instead deduce the type
$(t_1 \to t_2) \to t_1 \to t_2$ for (lambda (z) z). This type
would be closed to produce $\forall t_1 t_2.(t_1 \to t_2) \to t_1 \to t_2$,
which could be applied to the two types $s$ and $r$.

The CLOSE typing rule in IFX imposes the requirement
that the type closed be the most general type of
the body. This antecedent solves the problem described
above. Also, for purposes of disambiguation, the type
variables are bound in the same order in which they
appear in the type of the expression. These conditions
ensure that the explicitly polymorphic type of a close
expression is "frozen," and will have a fixed polymorphic
structure which:

• Can be relied upon by the programmer, so that
there is no ambiguity as to the form of the type.

• Avoids the need to extend the instantiation relation
to structurally different polymorphic types.

• Allows the type reconstruction algorithm to compute
the unique most general type when processing
a close expression.

This last item guarantees that the algorithm need make
no arbitrary choices, thus avoiding the need for backtracking.
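
The effect of the ordering convention and of the most general type restriction can be illustrated with a small Python sketch. The tuple encoding of types, the function names, and the binder names t1, t2, ... are assumptions of the sketch rather than part of IFX. The sketch does not check the most general type restriction itself; it only shows that closing the most general type of the identity function and closing one of its instances yield quantified types with different numbers of binders.

```python
def fgv(ty):
    """Free general type variables of ty, in order of first appearance."""
    tag = ty[0]
    if tag == "gen":
        return [ty[1]]
    if tag == "fun":
        out = fgv(ty[1])
        out += [v for v in fgv(ty[2]) if v not in out]
        return out
    if tag == "all":
        return fgv(ty[2])
    return []

def subst_gen(ty, mapping):
    """Replace general type variables of ty according to mapping."""
    tag = ty[0]
    if tag == "gen":
        return mapping.get(ty[1], ty)
    if tag == "fun":
        return ("fun", subst_gen(ty[1], mapping), subst_gen(ty[2], mapping))
    if tag == "all":
        return ("all", ty[1], subst_gen(ty[2], mapping))
    return ty

def close(ty, assignment_fgv=()):
    """Quantify the general variables of ty that are not free in the assignment.

    Binders are introduced in the order in which the variables appear in ty.
    The names t1, t2, ... are assumed not to clash with identifiers in ty.
    """
    names = [v for v in fgv(ty) if v not in assignment_fgv]
    binders = tuple("t%d" % (i + 1) for i in range(len(names)))
    body = subst_gen(ty, {v: ("id", b) for v, b in zip(names, binders)})
    return ("all", binders, body)

# Most general type of (lambda (z) z), and a more specific instance of it.
most_general = ("fun", ("gen", "a"), ("gen", "a"))
specific = subst_gen(most_general, {"a": ("fun", ("gen", "b"), ("gen", "c"))})

one_binder = close(most_general)   # one bound identifier
two_binders = close(specific)      # two bound identifiers
```

Only the first result is permitted by the CLOSE rule, which is why a projection with two type arguments is rejected for this expression.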

5  A  Type  Reconstruction 
Algorithm 

The type reconstruction algorithm, R, computes types
for IFX expressions that are consistent with the deductive
rule system in section 4. Algorithm R manipulates
type expressions which contain unification variables.
We have chosen to represent unification variables
explicitly, to avoid any possible confusion regarding the
status of particular type variables occurring within type
expressions. To summarize the notation:

$t$      Explicitly quantified, bound type variable.
$\alpha$      Generic type variable, implicitly bound.
$\mu$      Monomorphic type, possibly generic, omittable.
$\tau$      Unrestricted type.
$\nu$      Unification variable, which represents an M type
       being reconstructed.

Because a $\nu$ represents an M type, it cannot be unified
with type expressions containing $\forall$.

Our type reconstruction algorithm makes extensive
use of a unification algorithm for this type system.
Our unification algorithm and its implementation are
based on the work of [Morris68, Hindley69]. As in
[McCracken84], we require that the unification algorithm
be extended to support alpha-renaming of bound type
variables, including generic type variables. To state the
substitution lemma correctly, we must prevent the possibility
of user type identifier capture, so we make use of
BTV($e$), the type identifiers bound by plambda within
$e$. We say that $S$ is a valid M-substitution for $A$ and $e$
if $S$ maps type variables in FGV($A$) to M types which
do not contain any BTV($e$).

Lemma (Substitution). Given a type assignment $A$,
expression $e$, type $\tau$, and a valid M-substitution $S$:

$$A \vdash e : \tau \;\Rightarrow\; SA \vdash e : S\tau$$

Proof: By structural induction on $e$. □

Lemma (Extended Unification). There exists an
algorithm $U$ with these properties:

• Supports alpha-renaming of bound type variables,
including generic variables.

• Matches unification variables only with M types.

• Returns an M-substitution unifying $\tau$ and $\tau'$ or
fails, according to the usual unification rule defined
by V.

If there exists an M-substitution unifying $\tau$ and $\tau'$, then
$U(\tau, \tau')$ will return the most general unifying substitution:

$$S\tau \sim S\tau' \;\Rightarrow\; U(\tau,\tau') \to \hat{S}, \qquad \hat{S}\tau \sim \hat{S}\tau', \qquad \exists P : (P\hat{S} = S),$$

and otherwise $U(\tau, \tau')$ will fail:

$$\forall S : S\tau \not\sim S\tau' \;\Rightarrow\; U(\tau,\tau') \to \text{failure}.$$

Proof: Straightforward. □
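
A minimal Python sketch of a unification procedure with the three listed properties is given here as an illustration; it is not the implementation used for FX-89. The tuple encoding of types, the names UnifyError, apply, rename, and unify, and the use of fresh placeholder identifiers (written #0, #1, ...) to compare the bodies of two quantified types are all assumptions of the sketch. The sketch handles explicit quantifiers only and does not treat generic variables of type schemes.

```python
import itertools

class UnifyError(Exception):
    """Raised when no M-substitution unifies the two types."""

_skolems = itertools.count()

def apply(ty, s):
    """Apply substitution s (dict: unification-variable name -> type) to ty."""
    tag = ty[0]
    if tag == "uvar":
        return apply(s[ty[1]], s) if ty[1] in s else ty
    if tag == "fun":
        return ("fun", apply(ty[1], s), apply(ty[2], s))
    if tag == "all":
        return ("all", ty[1], apply(ty[2], s))
    return ty

def mentions(ty, tag, names):
    """True if ty contains a node (tag, n) with n in names."""
    if ty[0] == tag:
        return ty[1] in names
    if ty[0] == "fun":
        return mentions(ty[1], tag, names) or mentions(ty[2], tag, names)
    if ty[0] == "all":
        return mentions(ty[2], tag, names)
    return False

def is_mono(ty):
    """True if ty contains no explicit quantifier, i.e. it is an M type."""
    if ty[0] == "all":
        return False
    if ty[0] == "fun":
        return is_mono(ty[1]) and is_mono(ty[2])
    return True

def rename(ty, mapping):
    """Rename free type identifiers of ty (used to align bound variables)."""
    tag = ty[0]
    if tag == "id":
        return mapping.get(ty[1], ty)
    if tag == "fun":
        return ("fun", rename(ty[1], mapping), rename(ty[2], mapping))
    if tag == "all":
        inner = {k: v for k, v in mapping.items() if k not in ty[1]}
        return ("all", ty[1], rename(ty[2], inner))
    return ty

def unify(a, b, s=None):
    """Return an extension of s that unifies a and b, or raise UnifyError."""
    s = dict(s or {})
    a, b = apply(a, s), apply(b, s)
    if a == b:
        return s
    if a[0] == "uvar" or b[0] == "uvar":
        var, ty = (a, b) if a[0] == "uvar" else (b, a)
        if not is_mono(ty):
            raise UnifyError("a unification variable stands for an M type only")
        if mentions(ty, "uvar", {var[1]}):
            raise UnifyError("occurs check")
        s[var[1]] = ty
        return s
    if a[0] == "fun" and b[0] == "fun":
        return unify(a[2], b[2], unify(a[1], b[1], s))
    if a[0] == "all" and b[0] == "all" and len(a[1]) == len(b[1]):
        # Alpha-renaming: give both bodies the same fresh bound names.
        fresh = ["#%d" % next(_skolems) for _ in a[1]]
        ids = [("id", n) for n in fresh]
        s = unify(rename(a[2], dict(zip(a[1], ids))),
                  rename(b[2], dict(zip(b[1], ids))), s)
        # A bound variable must not escape through the substitution.
        if any(mentions(apply(t, s), "id", set(fresh)) for t in s.values()):
            raise UnifyError("bound type variable would escape its quantifier")
        return s
    raise UnifyError("cannot unify %r with %r" % (a, b))
```

The substitution is returned as a new dictionary, so a failed call leaves the caller's substitution unchanged.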

Algorithm R. The reconstruction algorithm is defined
as a recursive procedure. Algorithm $R(A, e)$ takes the
type assignment $A$ and the input expression $e$, and computes
$\langle S, \tau \rangle$. The algorithm fails if any of the invocations
of the unification algorithm fail, if any of the recursive
invocations of R fail, or if failure is specifically indicated
in the algorithm text, given below. The substitution
$S$ is an M-substitution on FGV($A$). The substitution
incorporates any information about the types of
the variables in dom($A$) which is discovered by traversing
the expression $e$. The design is such that $\tau$ is the
most general type of $e$ under the assignment $SA$, and $S$
is the minimal substitution on FGV($A$) which permits
$e$ to be typed.
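
The recursive structure described above can be sketched in Python. This is a simplified illustration in the spirit of Algorithm R, not a transcription of it, and it rests on several assumptions: types and expressions are encoded as tuples, a single substitution dictionary is updated in place instead of being returned, an application unifies the operator type with a fresh function type (as in ML-style inference), quantified types are compared only syntactically, and plambda and proj are omitted. All function names are hypothetical.

```python
import itertools

class Fail(Exception):
    """Type reconstruction failed."""

_ids = itertools.count()

def fresh():
    return ("uvar", next(_ids))

def apply(t, s):
    if t[0] == "uvar":
        return apply(s[t[1]], s) if t[1] in s else t
    if t[0] == "fun":
        return ("fun", apply(t[1], s), apply(t[2], s))
    if t[0] == "all":
        return ("all", t[1], apply(t[2], s))
    return t

def replace(t, kind, m):
    """Replace nodes (kind, name) by m[name]; kind is "uvar" or "id"."""
    if t[0] == kind:
        return m.get(t[1], t)
    if t[0] == "fun":
        return ("fun", replace(t[1], kind, m), replace(t[2], kind, m))
    if t[0] == "all":
        if kind == "id":  # identifiers rebound here are left alone
            m = {k: v for k, v in m.items() if k not in t[1]}
        return ("all", t[1], replace(t[2], kind, m))
    return t

def uvars(t):
    """Unification variables of t in order of first appearance."""
    if t[0] == "uvar":
        return [t[1]]
    if t[0] == "fun":
        out = uvars(t[1])
        return out + [v for v in uvars(t[2]) if v not in out]
    return uvars(t[2]) if t[0] == "all" else []

def mono(t):
    return t[0] != "all" and (t[0] != "fun" or (mono(t[1]) and mono(t[2])))

def unify(a, b, s):
    a, b = apply(a, s), apply(b, s)
    if a == b:
        return
    if a[0] == "uvar" or b[0] == "uvar":
        v, t = (a, b) if a[0] == "uvar" else (b, a)
        if not mono(t) or v[1] in uvars(t):  # M types only, plus occurs check
            raise Fail("cannot bind unification variable")
        s[v[1]] = t
    elif a[0] == "fun" and b[0] == "fun":
        unify(a[1], b[1], s)
        unify(a[2], b[2], s)
    else:
        raise Fail("type mismatch")

def generalizable(env, t, s):
    """Variables of t that are not free in the assignment (cf. Gen)."""
    bound = set()
    for qs, ty in env.values():
        bound |= set(uvars(apply(ty, s))) - set(qs)
    return [v for v in uvars(t) if v not in bound]

def infer(env, e, s):
    """Return a type for e under env; the substitution s is updated in place."""
    if isinstance(e, str):                       # variable (VARINST)
        if e not in env:
            raise Fail("unbound variable " + e)
        qs, t = env[e]
        return replace(apply(t, s), "uvar", {q: fresh() for q in qs})
    op = e[0]
    if op == "lambda":                           # 3 items: implicit; 4 items: typed
        x, ann, body = (e[1], fresh(), e[2]) if len(e) == 3 else e[1:]
        return ("fun", ann, infer({**env, x: ((), ann)}, body, s))
    if op == "app":                              # APPL / IAPPL
        tf, ta = infer(env, e[1], s), infer(env, e[2], s)
        tf = apply(tf, s)
        if tf[0] == "all":                       # implicit open of the operator
            tf = replace(tf[2], "id", {n: fresh() for n in tf[1]})
        res = fresh()
        unify(tf, ("fun", ta, res), s)
        return res
    if op == "let":                              # ILET
        t = apply(infer(env, e[2], s), s)
        scheme = (tuple(generalizable(env, t, s)), t)
        return infer({**env, e[1]: scheme}, e[3], s)
    if op == "open":                             # OPEN
        t = apply(infer(env, e[1], s), s)
        if t[0] != "all":
            raise Fail("open needs an explicitly quantified type")
        return replace(t[2], "id", {n: fresh() for n in t[1]})
    if op == "close":                            # CLOSE, binders in order of appearance
        t = apply(infer(env, e[1], s), s)
        qs = generalizable(env, t, s)
        names = tuple("t%d" % (i + 1) for i in range(len(qs)))
        return ("all", names, replace(t, "uvar", dict(zip(qs, [("id", n) for n in names]))))
    raise Fail("unknown expression form")
```

For example, applying infer to the encoding of the compose function wrapped in close produces a quantified type with three binders, introduced in order of appearance. A lambda whose parameter is used at two different types is accepted only when the parameter carries an explicit quantified type annotation.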
R(A, e) = case e of

    x
        if (x : ∀α_i.τ) ∉ A then fail
        return ⟨∅, τ[ν_i/α_i]⟩
        where ν_i are fresh.

    (proj e υ_i)
        let ⟨S, τ⟩ = R(A, e)
        if τ ≠ ∀t_i.τ' then fail
        return ⟨S, τ'[υ_i/t_i]⟩.

    (plambda (t_i) e)
        let ⟨S, τ⟩ = R(A, e)
        if any t_i ∈ FTV(SA) then fail
        return ⟨S, ∀t_i.τ⟩.

    (lambda (x : υ) e)
        let ⟨S, τ_r⟩ = R(A + (x : υ), e)
        return ⟨S, υ → τ_r⟩.

    (lambda (x) e)
        let ⟨S, τ⟩ = R(A + (x : ν), e)
        where ν is fresh
        return ⟨S, Sν → τ⟩.

    (e e_a)
        let ⟨S, τ_m⟩ = R(A, e)
        let ⟨S_a, τ_a⟩ = R(SA, e_a)
        if τ_m = ∀t_i.τ' then
            let τ = τ'[ν_i/t_i] where ν_i are fresh
        else
            let τ = τ_m
        if τ ≠ τ_f → τ_r then fail
        let V = U(S_a τ_f, τ_a)
        return ⟨V S_a S, V S_a τ_r⟩.

    (open e)
        let ⟨S, τ⟩ = R(A, e)
        if τ ≠ ∀t_i.τ' then fail
        return ⟨S, τ'[ν_i/t_i]⟩
        where ν_i are fresh.

    (close e)
        let ⟨S, τ⟩ = R(A, e)
        let {α_i} = FGV(τ) − FGV(SA)
        return ⟨S, ∀α_i.τ⟩.

    (let (x e_b) e)
        let ⟨S_b, τ_b⟩ = R(A, e_b)
        let {α_i} = FGV(τ_b) − FGV(S_b A)
        let ⟨S, τ⟩ = R(S_b A + (x : ∀α_i.τ_b), e)
        return ⟨S S_b, τ⟩.

    endcase.


6 Formal Properties of the
Reconstruction Algorithm

An algorithm is a type reconstruction algorithm for
IFX iff the algorithm has certain properties: termination,
soundness, and completeness. Each of these properties
may be viewed as a guarantee to the programmer
about the behavior of the algorithm:

• The algorithm will always halt, either failing, or
providing a type of the program.

• If the algorithm computes a type for the program,
then the program can be proved to have that type
according to the formal rules of IFX.

• If the program has some type according to the formal
rules of IFX, then the algorithm will compute
the most general type of the program.

We prove that Algorithm R has each of these three properties.
For convenience, we will write $R(A, e) \to \langle S, \tau \rangle$
when we mean "Algorithm R halts on input $A$ and $e$,
yielding $S$ and $\tau$."


6.1  Termination  of  Algorithm  R 

Theorem  (Termination).  Algorithm  R  terminates  on 
all  (finite)  inputs. 

Proof: Observe that Algorithm R is a syntax-directed
algorithm. $R(A, e)$ recursively traverses the syntax of
the input expression $e$, which is finite. Algorithm R
invokes the unification algorithm finitely many times,
and each invocation must terminate, because all type
expressions in IFX are finite. □


6.2  Correctness  of  Algorithm  R 

The notions of soundness and completeness of a typing
algorithm with respect to a formal typing system are
well-known. The most general type restriction in the
antecedents of several of the IFX typing rules complicates
the proofs of the soundness and completeness of
R. Specifically, the soundness of R and the completeness
of R cannot be proved independently, as is usually
done [DamasSS]. Consider, for example, the expression
(close e). R on (close e) cannot be sound unless
the type computed by R for $e$ satisfies the most general
type restriction, which means R must be complete.
Similarly, the soundness of R is used to prove its completeness.
The argument is not circular, but mutually recursive.
In other words, the soundness and completeness of R
must be proved together, by structural induction.

6.3  Soundness  of  Algorithm  R 

We show that Algorithm R is sound with respect to the
formal typing system in this sense:

Theorem (Syntactic Soundness). Given any type
assignment $A$ and expression $e$, if Algorithm R on input
$A$ and $e$ computes the substitution $S$ and the type $\tau$,
then $e$ has type $\tau$ under the assignment $SA$:

$$SA \vdash e : \tau.$$

Proof sketch: By structural induction on $e$, in tandem
with the proof of completeness. Assume, by induction,
that R is both sound and complete when applied to the
component expressions of $e$. For each possible case of
Algorithm R, there is a corresponding inference rule in
IFX. In each case, the antecedents of the inference rule
follow from the inductive hypothesis and the unification
lemma. □

6.4  Completeness  of  Algorithm  R 

We show that Algorithm R is complete with respect to
the formal typing system in the following sense:

Theorem (Syntactic Completeness). If $e$ has
type $\tau$ under assignment $SA$, where $S$ is a valid
M-substitution for $A$ and $e$, then $R(A, e)$ will compute a
substitution $\hat{S}$ and a type $\hat{\tau}$, such that $\hat{S}A$ is more
general than $SA$, and $\hat{\tau}$ is more general than $\tau$:

$$SA \vdash e : \tau \;\Rightarrow\; R(A, e) \to \langle \hat{S}, \hat{\tau} \rangle, \qquad \exists P : (P\hat{S}A \sim SA) \wedge (P\,\mathrm{Gen}(\hat{S}A, \hat{\tau}) \succeq \mathrm{Gen}(SA, \tau)).$$

Proof sketch: By structural induction on $e$, in tandem
with the proof of soundness. Assume, by induction,
that R is both sound and complete when applied to the
component expressions of $e$. Given that $SA \vdash e : \tau$,
the final step in the formal deduction uses a typing
rule of IFX. For each possible typing rule, there is a
corresponding case in Algorithm R. In each case, the
antecedents of the typing rule permit the use of the induction
hypothesis. The behavior of R on input $e$ then
follows from the application of the inductive hypothesis
to the recursive calls to R on component expressions of
$e$, and the unification lemma.

The detailed proof of the inductive step for (close e)
illustrates the use of the induction hypothesis, the connection
with the tandem proof of soundness, and the
importance of the most general type restriction as an
antecedent of the CLOSE rule.


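Case CLOSE. Given that

$$SA \vdash (\texttt{close}\ e) : \forall\alpha_i.\tau, \qquad (1)$$

we will show:

$$R(A, (\texttt{close}\ e)) \rightarrow (\hat{S}, \forall\alpha_j.\hat{\tau}), \qquad (2)$$

and demonstrate the existence of a substitution $P'$ such
that:

$$P'\hat{S}A \simeq SA \qquad (3)$$

$$P'\,\mathrm{Gen}(\hat{S}A, \forall\alpha_j.\hat{\tau}) \succeq \mathrm{Gen}(SA, \forall\alpha_i.\tau). \qquad (4)$$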

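The IFX deduction rule for (close e) proving (1) is
CLOSE, so therefore

$$SA \vdash e : \tau \qquad (5)$$

$$\{\alpha_i\} = \mathrm{FGV}(\tau) - \mathrm{FGV}(SA) \qquad (6)$$

$$SA \vdash e : \tau' \;\Rightarrow\; \mathrm{Gen}(SA, \tau) \succeq \mathrm{Gen}(SA, \tau'). \qquad (7)$$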

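By the inductive hypothesis on (5), there exists a
substitution $P$ such that:

$$R(A,e) \rightarrow (\hat{S}, \hat{\tau}) \qquad (8)$$

$$P\hat{S}A \simeq SA \qquad (9)$$

$$P\,\mathrm{Gen}(\hat{S}A, \hat{\tau}) \succeq \mathrm{Gen}(SA, \tau) \qquad (10)$$

$$\hat{S}A \vdash e : \hat{\tau} \qquad (11)$$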


In order to show (4), we will use the most general
type restriction to show that the two type schemes in
(10) are $\simeq$. By the substitution lemma, it follows from
(9) and (11) that

$$SA \simeq P\hat{S}A \vdash e : P\hat{\tau} \qquad (12)$$

But $\tau$ satisfies the most general type restriction (7), so
the above implies

$$\mathrm{Gen}(SA, \tau) \succeq \mathrm{Gen}(SA, P\hat{\tau}). \qquad (13)$$
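Combining (10) and (13) we have

$$P\,\mathrm{Gen}(\hat{S}A, \hat{\tau}) \succeq \mathrm{Gen}(SA, \tau) \succeq \mathrm{Gen}(SA, P\hat{\tau}). \qquad (14)$$

Now $\mathrm{dom}(P) \subseteq \mathrm{FGV}(\hat{S}A)$, so by definition of Gen, it is
clear that

$$P\,\mathrm{Gen}(\hat{S}A, \hat{\tau}) \simeq \mathrm{Gen}(P\hat{S}A, P\hat{\tau}). \qquad (15)$$

Thus, by (9) and (15), the two ends of (14) are $\simeq$ and the
middle is squeezed between them, so that the type schemes
of (14) are all $\simeq$, specifically

$$P\,\mathrm{Gen}(\hat{S}A, \hat{\tau}) \simeq \mathrm{Gen}(SA, \tau). \qquad (16)$$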

We must show that Algorithm R on (close e) computes
a substitution and a type having the properties
specified in the theorem. By definition of $R$ on
(close e), it follows from (8) that

$$R(A, (\texttt{close}\ e)) \rightarrow (\hat{S}, \forall\alpha_j.\hat{\tau}), \qquad (17)$$

where $\{\alpha_j\} = \mathrm{FGV}(\hat{\tau}) - \mathrm{FGV}(\hat{S}A)$. We must also find
a substitution $P'$ such that:

$$P'\hat{S}A \simeq SA \qquad (18)$$

$$P'\,\mathrm{Gen}(\hat{S}A, \forall\alpha_j.\hat{\tau}) \succeq \mathrm{Gen}(SA, \forall\alpha_i.\tau). \qquad (19)$$

We choose $P' = P$ and observe that (18) follows from
(9). By the choice of $\{\alpha_i\}$ and $\{\alpha_j\}$, it is obvious that
the Gen operations in (19) do not bind any generic type
variables. Therefore, (19) reduces to (16), by the
definition of $\succeq$. We have shown (2), (3), and (4), and this
completes the proof. □
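
As a concrete reading of (6) and of the side condition attached to (17), the following Python sketch computes the generic type variables that a close may quantify: those free in the type but not free in the assignment. The tuple encoding of types, the convention that generic variables are strings beginning with an apostrophe, and the names `fgv`, `fgv_assignment`, and `closable` are assumptions made for illustration; they are not part of IFX or FX-89, and explicitly quantified types are left out of the encoding.

```python
# Hypothetical encoding of types, for illustration only:
#   "'a"                 a generic type variable (leading apostrophe)
#   "int"                a base type or user type identifier
#   ("->", arg, result)  a function type

def fgv(ty):
    """Return the set of generic type variables occurring in ty."""
    if isinstance(ty, str):
        return {ty} if ty.startswith("'") else set()
    _, arg, result = ty
    return fgv(arg) | fgv(result)

def fgv_assignment(assignment):
    """Union of fgv over every type in a {variable: type} assignment."""
    out = set()
    for ty in assignment.values():
        out |= fgv(ty)
    return out

def closable(assignment, ty):
    """Variables a close may quantify: FGV(ty) - FGV(assignment)."""
    return fgv(ty) - fgv_assignment(assignment)

identity = ("->", "'a", "'a")
print(closable({}, identity))            # {"'a"}
print(closable({"y": "'a"}, identity))   # set()
```

In the second call the variable already occurs free in the assignment, so nothing can be quantified, which mirrors the subtraction of FGV(SA) in (6).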

7  Possible  Extensions 

The typing system of IFX may be extended to provide a
richer type language. For example, the FX-89 design
includes static side-effect information, module values, the
usual sum and product types, and recursive definitions.
Various extensions are briefly discussed.

7.1  Recursive  Definitions 

An  important  omission  from  the  IFX  typing  rules  is 
any  means  to  define  values  recursively.  The  following 
two  rules  provide  both  implicitly  and  explicitly  typed 
versions  of  letrec: 

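LETREC

$$
\frac{\begin{array}{c}
A_x + (x : \upsilon_x) \vdash e_x : \upsilon_x \\
A_x + (x : \upsilon_x) \vdash e : \tau
\end{array}}
{A \vdash (\texttt{letrec}\ (x : \upsilon_x\ e_x)\ e) : \tau}
$$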


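ILETREC

$$
\frac{\begin{array}{c}
A_x + (x : \mu) \vdash e_x : \mu \\
A_x + (x : \mu') \vdash e_x : \mu' \;\Rightarrow\; \mathrm{Gen}(A_x, \mu) \succeq \mathrm{Gen}(A_x, \mu') \\
A_x + (x : \mathrm{Gen}(A_x, \mu)) \vdash e : \tau
\end{array}}
{A \vdash (\texttt{letrec}\ (x\ e_x)\ e) : \tau}
$$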

The LETREC rule requires that the programmer specify
the type, but provides no generic polymorphism.
The ILETREC rule provides generic polymorphism, but
does not permit T types for the bound variable. Neither
rule permits the bound variable to be generic within its
own defining expression. We believe that Algorithm R
extends to these rules in a straightforward manner.

The  following  rule  combines  the  good  features  of  the 
above  rules: 

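LR2

$$
\frac{\begin{array}{c}
A_x + (x : \tau_x) \vdash e_x : \tau_x \\
A_x + (x : \tau'_x) \vdash e_x : \tau'_x \;\Rightarrow\; \mathrm{Gen}(A_x, \tau_x) \succeq \mathrm{Gen}(A_x, \tau'_x) \\
A_x + (x : \mathrm{Gen}(A_x, \tau_x)) \vdash e : \tau
\end{array}}
{A \vdash (\texttt{letrec}\ (x\ e_x)\ e) : \tau}
$$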

The  LR2  rule  permits  the  omission  of  the  T  type  of  the 
bound  variable.  We  have  not  extended  Algorithm  R  to 
compute  types  consistent  with  rule  LR2.  Whether  such 
types  can  be  computed  is  an  open  question. 


7.2  Implicit  Projection 

The version of our IAPPL rule presented in [McCracken84]
does not restrict the type arguments to be
M types, but rather requires that all type arguments
appear in the types of the formal subroutine arguments:


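IPA

$$
\frac{\begin{array}{c}
A \vdash e : \forall t_i.(\tau_f \rightarrow \tau_r) \\
A \vdash e_a : \tau_a \\
\{t_i\} \subseteq \mathrm{FTV}(\tau_f) \\
\tau_a = \tau_f[\tau_i/t_i]
\end{array}}
{A \vdash (e\ e_a) : \tau_r[\tau_i/t_i]}
$$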

This extension corresponds to a straightforward
modification to Algorithm R and the unification algorithm.
However, the completeness of the resulting algorithm
remains an open question.
7.3  Type  Closure  w.r.t.  User  Type  Identifiers

The CLOSE rule of IFX does not permit closure
with respect to type variables which appear in
user-supplied types. For this reason, the type variables which
the programmer chooses to write explicitly must also
be abstracted explicitly, via plambda. For example,
(close (lambda (x) x)) : $\forall t.t \rightarrow t$, but we do not
have (close (lambda (x : t) x)) : $\forall t.t \rightarrow t$ and
instead must write (plambda (t) (lambda (x : t) x)).

The type reconstruction algorithm R can be modified
to perform type closures with respect to these
user-supplied type identifiers, and the resulting
flexibility may be desirable. Unfortunately, we have not
found a typing rule for close which both permits
(close (lambda (x : t) x)) : $\forall t.t \rightarrow t$ and prevents

    (lambda (y)
      (close (lambda (x)
               (lambda (h : t)
                 ((lambda (z) x) y)))))
    : $t \rightarrow \forall s.(s$

The difficulty is that if the rule for close does
permit abstraction with respect to user-supplied type
identifiers, then the behavior of close depends upon
the choice of omitted M types on lambda-bound
variables. Particular choices of omitted M types will enlarge
FTV($A$), thus preventing some type abstractions.

In  the  above  example,  if  y  is  assigned  the  type  t,  then 
t  is  not  abstractable  by  the  close  appearing  in  the 
body,  because  it  appears  free  in  the  type  assignment. 
However, the modified algorithm R would not assign
y  the  type  t,  because  the  type  of  y  is  not  constrained 
by  the  example.  Therefore,  closure  with  respect  to  t 
would  be  performed.  This  problem  does  not  occur  with 
generic  type  variables,  because  they  cannot  appear  in 
the  types  supplied  by  the  programmer.  This  may  be 
seen  by  comparing  the  definitions  of  the  type  domains 
U  and  T. 
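
The dependence on the type chosen for y can be made concrete with a small Python sketch. It computes which user type identifiers remain abstractable under two different assignments for y. The tuple encoding, the apostrophe convention for generic variables, the body type `'s -> (t -> 's)` (our reading of the example, with x given the generic type `'s`), and the names `free_type_ids` and `abstractable` are all assumptions made for illustration.

```python
def free_type_ids(ty):
    """Collect every type identifier (a plain string) occurring in ty."""
    if isinstance(ty, str):
        return {ty}
    return set().union(*(free_type_ids(part) for part in ty[1:]))

def abstractable(assignment, body_type, user_ids):
    """User type identifiers in body_type that are not free in assignment."""
    in_assignment = set()
    for ty in assignment.values():
        in_assignment |= free_type_ids(ty)
    return (free_type_ids(body_type) & user_ids) - in_assignment

# Assumed type of (lambda (x) (lambda (h : t) ((lambda (z) x) y))), x : 's.
body_type = ("->", "'s", ("->", "t", "'s"))

print(abstractable({"y": "t"}, body_type, {"t"}))    # set()
print(abstractable({"y": "'a"}, body_type, {"t"}))   # {'t'}
```

With y : t the identifier t is free in the assignment and cannot be abstracted; with y left unconstrained it can, which is the discrepancy described above.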

7.4  Effect  Specifications

The type language of FX-87 provides side-effect
specifications, which permit a fine-grained description of
allocation, access, and mutation actions performed during
the evaluation of an expression. These effect
specifications are attached to function types, in order to permit
accurate checking at the time of compilation.

The IFX typing rules permit monomorphic types to
be omitted from formal argument type declarations.
Effect specifications can be included in such a
classification; such an approach is adopted by the current design
of FX-89.

7.5  Applicative  Types 

In FX-87, type abstraction is permitted only when the
side-effect specifications ensure that the polymorphic
expression is referentially transparent. [Tofte87] takes a
different approach, based on the concept of applicative
types. Tofte classifies certain expressions as expansive,
and permits type abstraction of these expressions only
with respect to applicative type variables. This type
abstraction rule permits different type abstractions than
does the FX-87 pure-plambda rule.

A thorough comparison of these two abstraction rules
is beyond the scope of this paper. [Tofte87] treats the
issue of type abstraction in the presence of mutable data.
We believe that the imperative typing discipline
introduced in [Tofte87] could be combined with the type
reconstruction system we have presented.

7.6  IFX  as  a  Design  Point 

The language IFX may be viewed as one point in a
design space of languages. The rules comprising IFX
provide certain language constructs which define the
behavior and interactions of generic and explicitly
polymorphic types. However, other typing rules are also
possible. For example, the VARINST rule could be
changed to (in effect) automatically open any
lambda-bound variable, except where some syntactic device
appears.

The particular rules chosen will influence the
programmer's use of polymorphism, as some usages are
made more convenient than others. Choosing among
such design alternatives is an engineering decision which
may require empirical investigation.


8  Conclusion 

We have developed a type reconstruction system which
combines the convenience of ML-style typing with the
full explicit typing power of the second-order
polymorphic lambda calculus. The system permits the
coexistence in the type domain of generic and explicitly
polymorphic types, and provides operators with which the
programmer may control conversions between the two
forms of polymorphism. We have introduced the "most
general type" restriction for type abstractions,
permitting the use of a unification-based typing algorithm.
A practical type reconstruction algorithm for this
language has been exhibited and its correctness has been
proven.